proposition 4
Fast Bellman Updates for Wasserstein Distributionally Robust MDPs
Markov decision processes (MDPs) often suffer from the sensitivity issue under model ambiguity. In recent years, robust MDPs have emerged as an effective framework to overcome this challenge. Distributionally robust MDPs extend the robust MDP framework by incorporating distributional information of the uncertain model parameters to alleviate the conservative nature of robust MDPs.
- Asia > China > Hong Kong (0.04)
- North America > United States > Massachusetts (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.92)
- Government (0.46)
- Leisure & Entertainment > Games (0.46)
- North America > United States > Colorado (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Asia > Singapore (0.04)
- Asia > China > Zhejiang Province (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.92)
- North America > United States > California > Alameda County > Berkeley (0.14)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
Data-driven Optimal Filtering for Linear Systems with Unknown Noise Covariances
This paper examines learning the optimal filtering policy, known as the Kalman gain, for a linear system with unknown noise covariance matrices using noisy output data. The learning problem is formulated as a stochastic policy optimization problem, aiming to minimize the output prediction error. This formulation provides a direct bridge between data-driven optimal control and, its dual, optimal filtering.
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Illinois > Cook County > Chicago (0.04)